Enhanced Attention-Based Encoder-Decoder Framework for Text Recognition

Abstract

Recognizing irregular text in natural images is a challenging task in computer vision. Existing approaches still face difficulties in recognizing irregular text because of its diverse shapes. In this paper, we propose a simple yet powerful irregular text recognition framework based on an encoder-decoder architecture. The proposed framework is divided into four main modules. Firstly, in the image transformation module, a Thin Plate Spline (TPS) transformation is employed to transform the irregular text image into a readable text image. Secondly, we propose a novel Spatial Attention Module (SAM) to compel the model to concentrate on text regions and obtain enriched feature maps. Thirdly, a deep bi-directional long short-term memory (Bi-LSTM) network is used to make a contextual feature map out of the visual feature map generated from a Convolutional Neural Network (CNN). Finally, we propose a Dual Step Attention Mechanism (DSAM) integrated with the Connectionist Temporal Classification (CTC)-Attention decoder to re-weight visual features and focus on the intra-sequence relationships to generate a more accurate character sequence. The effectiveness of our proposed framework is verified through extensive experiments on various benchmark datasets, such as SVT, ICDAR, CUTE80, and IIIT5k. The performance of the proposed text recognition framework is analyzed with the accuracy metric. The results demonstrate that our proposed method outperforms the existing approaches on both regular and irregular text. Additionally, the robustness of our approach is evaluated using grocery datasets, such as GroZi-120, WebMarket, SKU-110K, and Freiburg Groceries, which contain complex text images. Even so, our framework produces superior performance on the grocery datasets.
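The abstract names a CTC-based decoder and an accuracy metric without giving details. As a rough illustration only, the sketch below shows standard CTC best-path (greedy) decoding and an exact-match word accuracy in plain Python. It is not the authors' implementation: the function names, the blank index `0`, the label-to-character mapping, and the case-sensitive exact-match criterion are all assumptions made for this example, and the paper's TPS, SAM, Bi-LSTM, and DSAM modules are not modeled here.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_ids: Sequence[int], charset: str, blank: int = BLANK) -> str:
    """Standard CTC best-path rule: merge repeated frame labels, then drop blanks.

    frame_ids holds the most likely label per time step; label i (i >= 1)
    maps to charset[i - 1] under the assumed indexing.
    """
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(charset[label - 1] for label in collapsed if label != blank)


def word_accuracy(predictions: List[str], targets: List[str]) -> float:
    """Fraction of samples whose predicted string equals the ground truth exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


if __name__ == "__main__":
    charset = "helo"  # toy alphabet: 1 -> h, 2 -> e, 3 -> l, 4 -> o
    frames = [1, 1, 0, 2, 3, 3, 0, 3, 4, 4]  # blank between the two l's keeps both
    word = ctc_greedy_decode(frames, charset)
    print(word)
    print(word_accuracy([word, "word"], ["hello", "world"]))
```

The blank between the two runs of label `3` is what lets the decoder emit a double letter; without it the repeated labels would merge into one character. Benchmarks differ on case and punctuation handling, so the matching rule in `word_accuracy` would need to follow whatever protocol a given evaluation uses.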

Similar resources

Video Summarization with Attention-Based Encoder-Decoder Networks

This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames and the output is a keyshot sequence. Our key idea is to learn a deep summarization network with an attention mechanism that mimics the way humans select keyshots. To this end, we propose a novel video summariz...

Japanese Text Normalization with Encoder-Decoder Model

Text normalization is the task of transforming lexical variants to their canonical forms. We model the problem of text normalization as a character-level sequence-to-sequence learning problem and present a neural encoder-decoder model for solving it. To train the encoder-decoder model, many sentence pairs are generally required. However, Japanese non-standard canonical pairs are scarce in the ...

SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network

A new approach for real-time scene text recognition is proposed in this paper. It combines a novel binary convolutional encoder-decoder network (B-CEDNet) with a bidirectional recurrent neural network (Bi-RNN). The B-CEDNet is engaged as a visual front-end to provide elaborated character detection, and a back-end Bi-RNN performs character-level sequential correction and classification based on learn...

Implicit Distortion and Fertility Models for Attention-based Encoder-Decoder NMT Model

Neural machine translation has shown very promising results lately. Most NMT models follow the encoder-decoder framework. To make encoder-decoder models more flexible, the attention mechanism was introduced to machine translation and to other tasks such as speech recognition and image captioning. We observe that the quality of translation by attention-based encoder-decoder can be significantly damag...

An Encoder-Decoder Framework Translating Natural Language to Database Queries

Machine translation is going through a radical revolution, driven by the explosive development of deep learning techniques using Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). In this paper, we consider a special case of machine translation: translating natural language into Structured Query Language (SQL) for data retrieval over relational databases. ...

Journal

Journal title: Intelligent Automation and Soft Computing

Year: 2023

ISSN: 2326-005X, 1079-8587

DOI: https://doi.org/10.32604/iasc.2023.029105